@jianzhangbjz (Member) commented Oct 24, 2025

Description of the change:

Problem

Prometheus fails to scrape OLM metrics with HTTP 500 errors. Analysis shows the root cause is:

  E1024 08:45:54 Failed to make webhook authenticator request:
  Post "https://172.30.0.1:443/apis/authentication.k8s.io/v1/tokenreviews":
  tls: failed to verify certificate: x509: certificate signed by unknown authority

The authentication filter uses the TokenReview API to validate bearer tokens from Prometheus, but the API call fails because the client cannot verify the API server's TLS certificate.

Motivation for the change:

Architectural changes:

  • Extracts CA certificates from config.CAData or config.CAFile
  • Creates a TLS transport with proper root CA pool
  • Returns a fully configured client that can verify API server certificates

Testing remarks:

  • Added comprehensive unit tests (10 test cases, all passing)
  • Tests verify TLS configuration is properly applied
  • Tests demonstrate the bug with empty http.Client
  • Tests validate CA certificate loading
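To illustrate the kind of check these tests describe, here is a small sketch, assuming a throwaway TLS server from `net/http/httptest` in place of the real API server. `probe` and `compareClients` are hypothetical helpers and are not taken from the PR's test suite. The idea is that an empty `http.Client` does not trust the server's self-signed certificate, while a client whose root pool contains that certificate does.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"net/http"
	"net/http/httptest"
)

// probe issues a GET request and reports only whether it succeeded.
func probe(client *http.Client, url string) error {
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

// compareClients starts a local TLS server with a self-signed certificate
// and calls it twice: once with an empty http.Client and once with a
// client that trusts the server's certificate.
func compareClients() (bareErr, trustedErr error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(
		func(w http.ResponseWriter, _ *http.Request) {
			w.WriteHeader(http.StatusOK)
		}))
	defer srv.Close()

	// Empty client: relies on the system trust store, which does not
	// contain the test server's certificate.
	bareErr = probe(&http.Client{}, srv.URL)

	// Configured client: the server certificate is added to the root pool.
	pool := x509.NewCertPool()
	pool.AddCert(srv.Certificate())
	trusted := &http.Client{Transport: &http.Transport{
		TLSClientConfig: &tls.Config{RootCAs: pool},
	}}
	trustedErr = probe(trusted, srv.URL)

	return bareErr, trustedErr
}
```

Calling `compareClients()` should yield a non-nil first error (a certificate verification failure) and a nil second error. The exact error text depends on the Go version and platform, so the sketch does not match on it.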

Reviewer Checklist

  • Implementation matches the proposed design, or proposal is updated to match implementation
  • Sufficient unit test coverage
  • Sufficient end-to-end test coverage
  • Bug fixes are accompanied by regression test(s)
  • e2e tests and flake fixes are accompanied by evidence of flake testing, e.g. executing the test 100(0) times
  • tech debt/todo is accompanied by issue link(s) in comments in the surrounding code
  • Tests are comprehensible, e.g. Ginkgo DSL is being used appropriately
  • Docs updated or added to /doc
  • Commit messages sensible and descriptive
  • Tests marked as [FLAKE] are truly flaky and have an issue
  • Code is properly formatted

Assisted-by: Claude Code

@jianzhangbjz (Member, Author)

Hi @tmshort, could you help take a look? Thanks!

@anik120 (Member) commented Oct 24, 2025

The e2e tests are passing here. I've created #3687 to run the metrics e2e test with these changes.

@anik120 (Member) left a comment

/lgtm

The metrics e2e tests are passing in #3687 with these changes.

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 24, 2025
openshift-ci bot commented Oct 24, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: anik120

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 24, 2025
@openshift-merge-bot openshift-merge-bot bot merged commit bdbfb2d into operator-framework:master Oct 24, 2025
19 of 21 checks passed